FIGURE 2. Molecular organisation of the gene CG3842 (GadFly Accession Number)
FIGURE 3. BLASTP RESULTS FOR CG3842

FIGURE 3A. Homology to human gene ref XM_085058, protein ref XP_085058.1

>ref|XP_085058.1| (XM_085058) similar to unnamed protein product [Homo sapiens]
dbj|BAB70811.1| (AK054835) unnamed protein product [Homo sapiens]
Length = 316

Score = 266 bits (681), Expect = 2e-70

Identities = 163/317 (51%), Positives = 206/317 (64%), Gaps = 13/317 (4%)

Query: 45   LIVLGILL----FMWL----LRKCIQGPAYRKANRIDGKVVIVTGCNTGIGKETVLELAK 96
            L+ LG+L     F+++    +RK   G   R   ++ GKVV++TG NTGIGKET  ELA
Sbjct: 2    LVTLGLLTSFFSFLYMVAPSIRKFFAGGVCRTNVQLPGKVVVITGANTGIGKETARELAS 61

Query: 97   RGARVYMACRDPGRCEAARLDIMDRSRNQQLFNRTLDLGSLQSVRNFVERFKAEESRLDI 156
            RGARVY+ACRD  + E+A  +I   ++N Q+  R LDL   +S+R F E F AEE +L I
Sbjct: 62   RGARVYIACRDVLKGESAASEIRVDTKNSQVLVRKLDLSDTKSIRAFAEGFLAEEKQLHI 121

Query: 157  LINNAGVMACPRTLTADGFEQQFGVNHLGHFLLTNLLLDRLKHSSPSRIVVVSSAAHLFG 216
            LINNAGVM CP + TADGFE   GVNHLGHFLLT LLL+RLK S+P+R+V VSS AH  G
Sbjct: 122  LINNAGVMMCPYSKTADGFETHLGVNHLGHFLLTYLLLERLKVSAPARVVNVSSVAHHIG 181

Query: 217  RINREDLMSEKNYSKFFGAYSQSKLANILFTLKLSTILKDTGVTVNCCHPGVVRTEINRH 276
            +I   DL SEK YS+ F AY  SKLAN+LFT +L+  L+ TGVT    HPGVVR+E+ RH
Sbjct: 182  KIPFHDLQSEKRYSRGF-AYCHSKLANVLFTRELAKRLQGTGVTTYAVHPGVVRSELVRH 240

Query- 277 FSGPGWMKTALQKGSLYFFKTPKAGAQTQLRIAIDPQLEGSTGGYYSD 336 

s + L+ F KT + GAQT L AL LE +G Y+SDC R + P RN 
Sbjct: 241 SS LLCLLVfitLFSPFVKTAREGiAQTSLHCALAEGLEPLSGKV PRARN 296 

Query: 337  MQTADWLWRESEKLLGL 353
             +TA+ LW  S +LLG+
Sbjct: 297  NKTAERLWNVSCELLGI 313



FIGURE 3B. Homology to human gene ref NM_020905, protein ref NP_065956.1

>ref|NP_065956.1| (NM_020905) PAN2 protein [Homo sapiens]
gb|AAG12190.1|AF237952_1 (AF237952) PAN2 [Homo sapiens]
gb|AAH09830.1|AAH09830 (BC009830) PAN2 protein [Homo sapiens]
Length = 336

Score = 254 bits (648), Expect = 1e-66

Identities = 152/319 (47%), Positives = 191/319 (59%), Gaps = 20/319 (6%)

Query 54 MWLLRKC I QG P AYRKANR IIX3KVVIVTGQJTGIGKETVLEIJUCRGARVYMACRD 107 

+WL + GP ++ R + GK V++TG N+G+G+ T EL + GARV M CRD 

Sbjct: 17 LWLAARRFVG PRVQRLRRGGD P^ LMHG KTVLI TG ANSG LG RATAAE LLRLG ARV I MGCRD 76 

Query- 108 PGRCEAARLDIMDRSRNQ QLFNRTLDLGSLQSVRNFVERFKAEESRL 154 

REA+R +LR LDL SL+SVR F + EE RL 

Sbjct: 77 RARAEEAAGQUIREI^QAAECGPEPGVSGVGELIVREIX>L^ 136 

Query- 155 DILINNAGVMACPRTLTADGFEQQFGVNHLGHFIAT^^ 214 

D+LINNAG+ CP T DGFE QFGVNHLGHFLLTNLLL LK S + PSRIVWSS + 
Sbjct: 137 DVLINNAGIFQCPYMKTEDGFEMQFGVKHI^ 196 

Query- 215 FGRINREDLMS EKNY S KF FG AYS QS KLAN I LFTLKLST I LKDTGVTVNCCH PGWRTE IN 274 

+G IN +DL SE++Y+K F YS+SKLANILFT +L+ L+ T VTVN HPG+VRT + 
Sbjct: 197 YGDIl^DIJJSEQSYNKSF-rcSRSKXANILF^ 255 

Query- 275 RHFSGPGWM1CTAU2KGSLYFFKTPKAGAQTQLRI^ 334 

RH P +K S FFKTP GAQT + LA P++EG +G Y+ DC LP 

Sbjct: 256 RHIHIPIiVXPLFNLVSWAFFKTPVEGAQTSIYLASSPEVEGVSGRYFGDCKEEELLPKA 315 

Query: 335  RNMQTADWLWRESEKLLGL 353
             +   A  LW  SE ++GL
Sbjct: 316  MDESVARKLWDISEVMVGL 334
FIGURE 3C. Homology to human gene ref NM_016026, protein ref NP_057110.1

Pairwise alignment of Drosophila CG3842 encoded protein (query) and androgen-regulated
short-chain dehydrogenase/reductase 1; prostate short-chain dehydrogenase reductase 1;
CGI-82 protein [Homo sapiens] (subject)

>ref|NP_057110.1| (NM_016026) CGI-82 protein; likely ortholog of mouse cell line MC/9.IL4 derived transcript 1 [Homo sapiens]
ref|XP_031073.1| (XM_031073) CGI-82 protein [Homo sapiens]
gb|AAD34077.1|AF151840_1 (AF151840) CGI-82 protein [Homo sapiens]
gb|AAH00112.1|AAH00112 (BC000112) CGI-82 protein [Homo sapiens]
gb|AAK72049.1|AF395068_1 (AF395068) HCV core-binding protein HCBP12 [Homo sapiens]
gb|AAH11727.1|AAH11727 (BC011727) Similar to CGI-82 protein [Homo sapiens]



Length = 318

Score = 250 bits (638), Expect = 2e-65

Identities = 157/314 (50%), Positives = 196/314 (62%), Gaps = 7/314 (2%)


Query: 43   IFLIVLGILLFMWL--LRKCIQGPAYRKANRIDGKVVIVTGCNTGIGKETVLELAKRGAR 100
            + L++L  LL+M    +RK +         ++ GKVV+VTG NTGIGKET  ELA+RGAR
Sbjct: 8    LLLLLLPFLLYMAAPQIRKMLSSGVCTSTVQLPGKVVVVTGANTGIGKETAKELAQRGAR 67


Query: 101  VYMACRDPGRCEAARLDIMDRSRNQQLFNRTLDLGSLQSVRNFVERFKAEESRLDILINN 160
            VY+ACRD  + E    +I   + NQQ+  R LDL   +S+R F + F AEE  L +LINN
Sbjct: 68   VYLACRDVEKGELVAKEIQTTTGNQQVLVRKLDLSDTKSIRAFAKGFLAEEKHLHVLINN 127


Query: 161  AGVMACPRTLTADGFEQQFGVNHLGHFLLTNLLLDRLKHSSPSRIVVVSSAAHLFGRINR 220
            AGVM CP + TADGFE   GVNHLGHFLLT+LLL++LK S+PSRIV VSS AH  GRI+
Sbjct: 128  AGVMMCPYSKTADGFEMHIGVNHLGHFLLTHLLLEKLKESAPSRIVNVSSLAHHLGRIHF 187


Query: 221  EDLMSEKNYSKFFGAYSQSKLANILFTLKLSTILKDTGVTVNCCHPGVVRTEINRHFSGP 280
             +L  EK Y+    AY  SKLANILFT +L+  LK +GVT    HPG V++E+ RH S
Sbjct: 188  HNLQGEKFYNAGL-AYCHSKLANILFTQELARRLKGSGVTTYSVHPGTVQSELVRHSSFM 246


Query: 281  GWMKTALQKGSLYFFKTPKAGAQTQLRLALDPQLEGSTGGYYSDCMRWPLFPVARNMQTA 340
             WM         +F KTP+ GAQT L  AL   LE  +G ++SDC    +    RN   A
Sbjct: 247  RWMWWLFS----FFIKTPQQGAQTSLHCALTEGLEILSGNHFSDCHVAWVSVQARNETIA 302


Query: 341  DWLWRESEKLLGLP 354
              LW  S  LLGLP
Sbjct: 303  RRLWDVSCDLLGLP 316
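
The percentages quoted in the three BLASTP summaries can be re-derived from the raw counts. The Python sketch below is an illustration only: the helper names `blast_percent` and `check_reported` and the `REPORTED` table are hypothetical, and truncating to a whole number is an assumption that happens to reproduce every value printed in Figures 3A to 3C (for example, 206/317 is 64.98%, which is printed as 64%).

```python
def blast_percent(count: int, alignment_length: int) -> int:
    """Whole-number percentage of aligned columns, truncated (assumed convention)."""
    if alignment_length <= 0:
        raise ValueError("alignment length must be positive")
    return 100 * count // alignment_length


# (alignment length, {statistic: (count, percentage as printed in the figure)})
REPORTED = {
    "3A": (317, {"Identities": (163, 51), "Positives": (206, 64), "Gaps": (13, 4)}),
    "3B": (319, {"Identities": (152, 47), "Positives": (191, 59), "Gaps": (20, 6)}),
    "3C": (314, {"Identities": (157, 50), "Positives": (196, 62), "Gaps": (7, 2)}),
}


def check_reported(table=REPORTED):
    """Return (figure, statistic) pairs whose printed percentage is not reproduced."""
    mismatches = []
    for figure, (length, stats) in table.items():
        for name, (count, printed) in stats.items():
            if blast_percent(count, length) != printed:
                mismatches.append((figure, name))
    return mismatches


if __name__ == "__main__":
    print(check_reported())
```

An empty list from `check_reported()` means the counts and percentages in all three panels agree under the truncation assumption.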
FIGURE 4. CLUSTAL X (1.81) multiple sequence alignment



CGI - 82 PGKVWVTGANTG I GKET AKE LAQRGARVYLACRDVEKGELVAXE I QTTTGNQ - 

XP_085058 PGKVWI TGANTGI GKE TARE LAS RGARVY I ACRDVLKGESAAS E I RVDTKNS 

eg 3 842 DGKVVIVTGCNTGIGKETVLEUU^GARVYMACRDPGRC 

PAN 2 HGKTVL I TGANSGLGRATAAE LLRLGARVT MGCRDRARAE EAAGQLRRE LRQAAECG PEP 

* : ::**.* *:*: * . * : * . * :.*** : * :: : 

CGI -82 QVLVRKLDLSDTKSIRAFAKGFLAEEK-HLHVLINNAGVMMCPYS - KTADGFEM 

XP_085058 QVLVRKLDLSDTKSIRAFAEGFIAEEK-QLHILINNAGWIMCPYS-KTADGFET 

cg3 842 QLFNRTLDLGSLQSVRNFVERFKAEES -RLDILINNAGVMACPRT- LTADGFEQ 

PAN2 GVSGVGELIVREIiDLASLRSVRAFCQEMLQEEP-RLDVLINNAGI FQCPYM - KTEDGFEM 

... * **■*• *.** . * : * * * * * : * 

CGI - 82 H I GVNHLGHFLLTHIiLLE KLKE S APS RI VNVS S LAHHLGRI HFHNLQGE KFYNAGL - AYC 

XP_085058 HLGVNHIX^FLLTYLLLERLKVSAPARVVNVSSVAHHIGKI PFHDLQSEKRYSRGF-AYC 

cg3842 QFGVNHLGHFLLTNLLI^RLKHSSPSRIVWSSAAHLFGRIbniEDLM^ 
PAN 2 Q FGVNHLGHFLLTNLIiLGLLKS S APS RI VWS S KL YKYGD I NFDDLNS EQS YNKS F - CYS 

*** ** ..*.*.* .** . * * .* + . * 



CGI -82 HSKLANILFTQELARRIjKGSGVTTYSVHPGWQSELVRHSS FMRWMWWLFS 

XP_085 058 HS KIANVLFTRELAKRLQGTGVTT YAVH PGWRS ELVRHS S LLCLLWRLFS 

cg3 842 QSKXANILFTLKLSTILKDTGVTVNCCHPGVVRTEINRHFS G PG WMKTALQ - K-GS 

PAN 2 RSKLANILFTRELARRLEGTNVTVNVLHPGIVRTNLGRHIH IPLLVKPLFN- -LVS 

.**«**.** * :*:.**..:.**. ****:::*: . : 

CGI - 8 2 - FFI KTPQQGAQTSLHCALTEGLE I LSGNHFSDCHVA 

XP_0 8505 8 - PFVKTAREGAQTSLHCALAEGLEPLSGKYFSDCKRT 

eg 3 84 2 LYFFKTPKAGAQTQLRLALDPQLEGSTGGYYSDCMRW 

PAN2 WAFFKTPVEGAQTSIYLASSPEVEGVSGRYFGDCKEE 
; * * * * * * - * .. 
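
In the consensus rows of Figure 4, `*` marks a column in which all four sequences carry the same residue. The Python sketch below recomputes only that symbol; the function name `conserved_columns` is hypothetical, the `:` and `.` symbols (which depend on CLUSTAL's residue groups) are deliberately not attempted, and the test rows are an 11-residue segment of the first alignment block with the OCR spacing removed.

```python
def conserved_columns(rows):
    """Return a string with '*' under every fully conserved, gap-free column."""
    if not rows:
        return ""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    marks = []
    for column in zip(*rows):
        # A column is conserved when every row has the same residue and it is not a gap.
        same = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if same else " ")
    return "".join(marks)


# Segment of the first block of Figure 4 (row order: CGI-82, XP_085058, cg3842, PAN2).
SEGMENT = [
    "TGANTGIGKET",
    "TGANTGIGKET",
    "TGCNTGIGKET",
    "TGANSGLGRAT",
]

if __name__ == "__main__":
    for row in SEGMENT:
        print(row)
    print(conserved_columns(SEGMENT))
```

Only strict identity is tested here, so the output is a lower bound on what the full CLUSTAL consensus line would mark.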
Figure 5. Sequences of the human proteins of the invention

Figure 5A. Nucleic acid sequence of human unnamed protein
XP_085058 (SEQ ID NO:1)

1 aggactgtat gctgttctta aggactctct gcttcctgga caagctcaag ctaaggacta 

61 catctcccag caggctgtgc tctgacagct cttggattta aataggattc tgggctctgc 

121 tcagagtcag gctgctgctc agcacccagg acggagagga gcagagaagc agcagaagca 

181 gccaagagct ggagccagac caggaacctg agccagagct ggggttgaag ctggagcagc 

241 agcaaaagca acagcagcta cagaagttgg aacgatgctg gtcaccttgg gactgctcac 

301 ctccttcttc tcgttcctgt atatggtagc tccatccatc aggaagttct ttgctggtgg 

361 agtgtgtaga acaaatgtgc agcttcctgg caaggtagtg gtgatcactg gcgccaacac 

421 gggcattggc aaggagacgg ccagagagct cgctagccga ggagcccgag tctatattgc 

481 ctgcagagat gtactgaagg gggagtctgc tgccagtgaa atccgagtgg atacaaagaa 

541 ctcccaggtg ctggtgcgga aattggacct atccgacacc aaatctatcc gagcctttgc 

601 tgagggcttt ctggcagagg aaaagcagct ccatattctg atcaacaatg cgggagtaat 

661 gatgtgtcca tattccaaga cagctgatgg ctttgaaacc cacctgggag tcaaccacct 

721 gggccacttc ctcctcacct acctgctcct ggagcggcta aaggtgtctg cccctgcacg

781 ggtggttaat gtgtcctcgg tggctcacca cattggcaag attcccttcc acgacctcca 

841 gagcgagaag cgctacagca ggggttttgc ctattgccac agcaagctgg ccaatgtgct 

901 ttttactcgt gagctggcca agaggctcca aggcaccggg gtcaccacct acgcagtgca 

961 cccaggcgtc gtccgctctg agctggtccg gcactcctcc ctgctctgcc tgctctggcg 

1021 gctcttctcc ccctttgtca agacggcacg ggagggggcg cagaccagcc tgcactgcgc 

1081 cctggctgag ggcctggagc ccctgagtgg caagtacttc agtgactgca agaggacctg 

1141 ggtgtctcca agggcccgaa ataacaaaac agctgagcgc ctatggaatg tcagctgtga 

1201 gcttctagga atccggtggg agtagctggt ggaagagctg cagctttatc aggcccaatc 

1261 catgccataa tgaacaggga ccaaggagaa ggccaaccct aaaggattgt cctcttggcc 

1321 agctggtgct gcgaatcctg cctgctctga tcctcttgac ccttctggga atgtttgcac 

1381 acctgacact cttgtgagac tggcttatgg catgagttgt ggacacctat agagtgttct 

1441 tctctaagac ctggaaagtc agcaaccctc tgggggcagc aggactgggc agatcccagg 

1501 ctgggcatgg gggtggcaga agagcccgag aaattgggtc agttccctca tcagcaccag 

1561 aggctcagct gaggcaagaa gagcaccatc actgcctatt tctaggggct atacactcca 

1621 actcttggtt gatctctttc tttttaaaaa tatttgccac caccctggag tctagaccaa 

1681 cacacaaaga tcctggctaa ccctggccta tttagattcc ttcctctcac ctggaccttc 

1741 ccatttcaat catgcagatg gtttcttttt gtaaagagtt ccgtttgcct ttcaattttt
1801 agagaaaata aagactgcat tcatct 
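
The amino acid listing of Figure 5B can be cross-checked against the nucleic acid listing of Figure 5A by translating the open reading frame. The Python sketch below is a minimal illustration, assuming the standard genetic code and assuming that the reading frame starts at the `atg` at position 275 of Figure 5A; the names `CODON_TABLE`, `translate` and `ORF_START` are hypothetical, and only the first 27 nucleotides of the frame are included.

```python
from itertools import product

# Standard genetic code, codons enumerated with the bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first base up to the first stop codon (lower-case output)."""
    cleaned = "".join(dna.split()).upper().replace("U", "T")
    protein = []
    # Ignore an incomplete trailing codon, if any.
    for i in range(0, len(cleaned) - len(cleaned) % 3, 3):
        amino_acid = CODON_TABLE.get(cleaned[i:i + 3], "x")  # 'x' for unknown codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein).lower()


# Positions 275-301 of Figure 5A, spaced as in the listing.
ORF_START = "atgctg gtcaccttgg gactgctcac c"

if __name__ == "__main__":
    print(translate(ORF_START))
```

The translated fragment corresponds to the first nine residues listed in Figure 5B; passing the whole coding region in the same way allows the remaining rows to be compared.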

Figure 5B. Amino acid sequence of human unnamed protein XP_085058 
(SEQ ID NO:2) 



1 mlvtlgllts ffsflymvap sirkffaggv crtnvqlpgk vvvitgantg igketarela
61 srgarvyiac rdvlkgesaa seirvdtkns qvlvrkldls dtksirafae gflaeekqlh
121 ilinnagvmm cpysktadgf ethlgvnhlg hflltyllle rlkvsaparv vnvssvahhi
181 gkipfhdlqs ekrysrgfay chsklanvlf trelakrlqg tgvttyavhp gvvrselvrh
241 ssllcllwrl fspfvktare gaqtslhcal aegleplsgk yfsdckrtwv sprarnnkta
301 erlwnvscel lgirwe
Figure 5C. Nucleic acid sequence of human protein CGI-82 (SEQ ID NO:3)

1 gctggagcat cccgctctgg tgccgctgca gccggcagag atggttgagc tcatgttccc 
61 gctgttgctc ctccttctgc ccttccttct gtatatggct gcgccccaaa tcaggaaaat 
121 gctgtccagt ggggtgtgta catcaactgt tcagcttcct gggaaagtag ttgtggtcac 
181 aggagctaat acaggtatcg ggaaggagac agccaaagag ctggctcaga gaggagctcg 
241 agtatattta gcttgccggg atgtggaaaa gggggaattg gtggccaaag agatccagac 
301 cacgacaggg aaccagcagg tgttggtgcg gaaactggac ctgtctgata ctaagtctat
361 tcgagctttt gctaagggct tcttagctga ggaaaagcac ctccacgttt tgatcaacaa
421 tgcaggagtg atgatgtgtc cgtactcgaa gacagcagat ggctttgaga tgcacatagg 
481 agtcaaccac ttgggtcact tcctcctaac ccatctgctg ctagagaaac taaaggaatc 
541 agccccatca aggatagtaa atgtgtcttc cctcgcacat cacctgggaa ggatccactt 
601 ccataacctg cagggcgaga aattctacaa tgcaggcctg gcctactgtc acagcaagct 
661 agccaacatc ctcttcaccc aggaactggc ccggagacta aaaggctctg gcgttacgac 
721 gtattctgta caccctggca cagtccaatc tgaactggtt cggcactcat ctttcatgag
781 atggatgtgg tggcttttct cctttttcat caagactcct cagcagggag cccagaccag 
841 cctgcactgt gccttaacag aaggtcttga gattctaagt gggaatcatt tcagtgactg 
901 tcatgtggca tgggtgtctg tccaagctcg taatgagact atagcaaggc ggctgtggga 
961 cgtcagttgt gacctgctgg gcctcccaat agactaacag gcagtgccag ttggacccaa 
1021 gagaagactg cagcagacta cacagtactt cttgtcaaaa tgattctcct tcaaggtttt 
1081 caaaaccttt agcacaaaga gagcaaaacc ttccagcctt gcctgcttgg tgtccagtta 
1141 aaactcagtg tactgccaga ttcgtctaaa tgtctgtcat gtccagattt actttgcttc 
1201 tgttactgcc agagttacta gagatatcat aataggataa gaagaccctc atatgacctg 
1261 cacagctcat tttccttctg aaagaaacta ctacctagga gaatctaagc tatagcaggg 
1321 atgatttatg caaatttgaa ctagcttctt tgttcacaat tcagttcctc ccaaccaacc 
1381 agtcttcact tcaagagggc cacactgcaa cctcagctta acatgaataa caaagactgg 
1441 ctcaggagca gggcttgccc aggcatggtg gatcaccgga ggtcagtagt tcaagaccag 
1501 cctggccaac atggtgaaac cccacctcta ctaaaaattg tgtatatctt tgtgtgtctt 
1561 cctgtttatg tgtgccaagg gagtattttc acaaagttca aaacagccac aataatcaga 
1621 gatggagcaa accagtgcca tccagtcttt atgcaaatga aatgctgcaa agggaagcag 
1681 attctgtata tgttggtaac tacccaccaa gagcacatgg gtagcaggga agaagtaaaa 
1741 aaagagaagg agaatactgg aagataatgc acaaaatgaa gggactagtt aaggattaac 
1801 tagcccttta aggattaact agttaaggat taatagcaaa agatattaaa tatgctaaca 
1861 tagctatgga ggaattgagg gcaagcaccc aggactgatg aggtcttaac aaaaaccagc 
1921 gtggcaaaaa aaaaaaaaaa aaaaaaaaaa aaaaatccta aaaacaaaca aacaaaaaaa 
1981 acaattcttc attcagaaaa attatcttag ggactgatat tggtaattat ggtcaattta 
2041 ataatatttt ggggcatttc cttacattgt cttgacaaga ttaaaatgtc tgtgccaaaa 
2101 ttttgtattt tatttggaga cttcttatca aaagtaatgc tgccaaagga agtctaagga 
2161 attagtagtg ttcccatcac ttgtttggag tgtgctattc taaaagattt tgatttcctg 
2221 gaatgacaat tatattttaa ctttggtggg ggaaagagtt ataggaccac agtcttcact 
2281 tctgatactt gtaaattaat cttttattgc acttgttttg accattaagc tatatgttta 
2341 gaaatggtca ttttacggaa aaattagaaa aattctgata atagtgcaga ataaatgaat 
2401 taatgtttta cttaatttat attgaactgt caatgacaaa taaaaattct ttttgattat 
2461 tttttgtttt catttaccag aataaaaact aagaattaaa agtttgatta cagtcaaaaa 
2521 aaaaaaaaaa aaaaaaaa 

Figure 5D. Amino acid sequence of human protein CGI-82 
(SEQ ID NO:4) 

1 mvelmfplll lllpfllyma apqirkmlss gvctstvqlp gkvvvvtgan tgigketake
61 laqrgarvyl acrdvekgel vakeiqtttg nqqvlvrkld lsdtksiraf akgflaeekh
121 lhvlinnagv mmcpysktad gfemhigvnh lghfllthll leklkesaps rivnvsslah
181 hlgrihfhnl qgekfynagl aychsklani lftqelarrl kgsgvttysv hpgtvqselv
241 rhssfmrwmw wlfsffiktp qqgaqtslhc altegleils gnhfsdchva wvsvqarnet
301 iarrlwdvsc dllglpid